Emotion in Code-switching Texts: Corpus Construction and Analysis
نویسندگان
چکیده
Previous researches have focused on analyzing emotion through monolingual text, when in fact bilingual or code-switching posts are also common in social media. Despite the important implications of code-switching for emotion analysis, existing automatic emotion extraction methods fail to accommodate for the code-switching content. In this paper, we propose a general framework to construct and analyze the code-switching emotional posts in social media. We first propose an annotation scheme to identify the emotions associated with the languages expressing them in a Chinese-English code-switching corpus. We then make some observations and generate statistics from the corpus to analyze the linguistic phenomena of code-switching texts in social media. Finally, we propose a multiple-classifier-based automatic detection approach to detect emotion in the codeswitching corpus for evaluating the effectiveness of both Chinese and English texts.
منابع مشابه
Emotion Detection in Code-switching Texts via Bilingual and Sentimental Information
Code-switching is commonly used in the free-form text environment, such as social media, and it is especially favored in emotion expressions. Emotions in codeswitching texts differ from monolingual texts in that they can be expressed in either monolingual or bilingual forms. In this paper, we first utilize two kinds of knowledge, i.e. bilingual and sentimental information to bridge the gap betw...
متن کاملCultural Influence on the Expression of Cathartic Conceptualization in English and Spanish: A Corpus-Based Analysis
This paper investigates the conceptualization of emotional release from a cognitive linguistics perspective (Cognitive Metaphor Theory). The metaphor weeping is a means of liberating contained emotions is grounded in universal embodied cognition and is reflected in linguistic expressions in English and Spanish. Lexicalization patterns which encapsulate this conceptualization i...
متن کاملDetecting Code-Switching in a Multilingual Alpine Heritage Corpus
This paper describes experiments in detecting and annotating code-switching in a large multilingual diachronic corpus of Swiss Alpine texts. The texts are in English, French, German, Italian, Romansh and Swiss German. Because of the multilingual authors (mountaineers, scientists) and the assumed multilingual readers, the texts contain numerous code-switching elements. When building and annotati...
متن کاملEN-ES-CS: An English-Spanish Code-Switching Twitter Corpus for Multilingual Sentiment Analysis
Code-switching texts are those that contain terms in two or more different languages, and they appear increasingly often in social media. The aim of this paper is to provide a resource to the research community to evaluate the performance of sentiment classification techniques on this complex multilingual environment, proposing an English-Spanish corpus of tweets with code-switching (EN-ES-CS C...
متن کاملVocabulary Lists for EAP and Conversation Students
Despite the abundance of research investigating general and academic vocabularies and developing dozens of word lists, few studies have compared academic vocabulary with general service word lists such as conversation vocabulary. Many EAP researchers assume that university students need to know all the words in West’s (1953) General Service List (GSL) as a prerequisite to academic words (e.g., ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015